winch: respect the enable_nan_canonicalization setting by r-near · Pull Request #12939 · bytecodealliance/wasmtime

r-near · 2026-04-02T04:50:31Z

The enable_nan_canonicalization flag already flows through to Winch via the shared Flags, but Winch was ignoring it. This adds maybe_canonicalize_nan (scalar) and maybe_canonicalize_v128_nan (SIMD) methods to the Masm trait that, when the flag is set, emit a NaN-detecting sequence to replace results with the canonical quiet NaN.

Scalar: compare-with-self + conditional branch to load canonical NaN. Implemented for x64 and aarch64
SIMD (x64): unordered compare to build a NaN lane mask, then bitwise select with canonical NaN from the constant pool

Covers add, sub, mul, div, min, max, sqrt, ceil, floor, trunc, nearest, demote, and promote for both scalar and SIMD. Canonical NaN constants are shared across ISAs. Removes the canonicalize-nan.wast Winch skip.
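For readers unfamiliar with the scalar approach, here is a minimal sketch in plain Rust of what the emitted sequence is meant to compute. It models the semantics only and is not the Winch code: the function and constant names are made up for illustration, and the real implementation emits machine instructions through the Masm trait rather than operating on Rust floats.

```rust
/// Canonical quiet-NaN bit patterns (the same values quoted in the review thread).
const CANON_F32: u32 = 0x7FC0_0000;
const CANON_F64: u64 = 0x7FF8_0000_0000_0000;

/// Models the scalar sequence: compare the value with itself. Only a NaN
/// compares unequal to itself, and in that case the canonical NaN is loaded.
/// (Illustrative name, not part of Winch.)
fn canonicalize_f32(x: f32) -> f32 {
    if x != x {
        f32::from_bits(CANON_F32)
    } else {
        x
    }
}

/// Same idea for 64-bit floats.
fn canonicalize_f64(x: f64) -> f64 {
    if x != x {
        f64::from_bits(CANON_F64)
    } else {
        x
    }
}
```

Any NaN input, whatever its sign or payload, comes out with the canonical bit pattern, while non-NaN values (including negative zero and infinities) pass through bit-for-bit.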

Copilot

Pull request overview

This PR ensures Winch respects the existing enable_nan_canonicalization shared setting by adding a canonicalize_nan hook to the MacroAssembler trait and invoking it after scalar floating-point operations so NaN results are normalized to the canonical quiet-NaN bit pattern.

Changes:

Add MacroAssembler::canonicalize_nan and implement it for x64 and aarch64.
Invoke NaN canonicalization after scalar float arithmetic/conversion ops (and after rounding paths in the visitor).
Add a scalar WAST test to validate canonical NaN bit patterns.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
`winch/codegen/src/visitor.rs`	Inserts `canonicalize_nan` after scalar float ops and adds helper for rounding results.
`winch/codegen/src/masm.rs`	Extends the `MacroAssembler` trait with `canonicalize_nan`.
`winch/codegen/src/isa/x64/masm.rs`	Implements NaN detection + replacement with canonical NaN on x64.
`winch/codegen/src/isa/aarch64/masm.rs`	Implements NaN detection + replacement with canonical NaN on aarch64; stores shared flags in the masm.
`tests/misc_testsuite/canonicalize-nan-scalar.wast`	Adds scalar coverage for canonical NaN bit patterns and propagation.



github-actions · 2026-04-02T08:04:51Z

Subscribe to Label Action

cc @saulecabrera

Details

This issue or pull request has been labeled: "winch"

Thus the following users have been cc'd because of the following labels:

saulecabrera: winch

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.


r-near · 2026-04-02T20:39:58Z

@cfallin Thanks for reviewing #12940! Would you mind taking a look at this one as well?

cfallin · 2026-04-02T21:01:47Z

@r-near we usually either stick with the assignment bot's suggestions (the reason I reviewed your last PR) or, in cases such as this one, tag a sub-area reviewer who knows part of the code best -- in this case @saulecabrera might be best to review Winch-specific work?

saulecabrera · 2026-04-02T22:04:27Z

I'll review.

saulecabrera

Looking good, some comments inline.

saulecabrera · 2026-04-06T13:54:39Z

winch/codegen/src/isa/aarch64/masm.rs

        Ok(())
    }

+    fn canonicalize_nan(&mut self, reg: WritableReg, size: OperandSize) -> Result<()> {


Given that the canonicalization here is optional, let's rename this method to maybe_canonicalize_nan.

Also, could you expand a bit on the rationale for adding a per-Masm implementation instead of an ISA-agnostic one? A priori, it seems that the MacroAssembler exposes enough functionality to express canonicalization one layer above.

The trait doesn't have float-and-compare branch or float-constant-load, so an agnostic version would need to go through float_cmp_with_set plus an integer branch, which is a few more instructions than the native path each backend can do.

I was leaning per-ISA, but perhaps we just add those primitives to the trait?

BTW I added the shared constants so that should at least cover the duplication concern.

Renamed the method here: 7cdd58e (this PR)

saulecabrera · 2026-04-06T16:44:23Z

winch/codegen/src/isa/x64/masm.rs

+            OperandSize::S32 => &0x7FC00000u32.to_le_bytes(),
+            OperandSize::S64 => &0x7FF8000000000000u64.to_le_bytes(),


To avoid repeating these immediate values across ISAs, can we introduce a constant at the Masm layer and import them here instead?

Yep, here: 75cecea (this PR)
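As a rough illustration of the shared-constant idea, the sketch below defines the two bit patterns once and derives the little-endian bytes per operand size. The constant names, the `Size` enum, and `canonical_nan_bytes` are hypothetical stand-ins for this example; the actual names introduced in 75cecea may differ.

```rust
/// Hypothetical shared constants; the names are illustrative, not Winch's.
pub const CANONICAL_NAN_F32: u32 = 0x7FC0_0000;
pub const CANONICAL_NAN_F64: u64 = 0x7FF8_0000_0000_0000;

/// Stand-in for an operand-size enum, used only in this sketch.
#[derive(Clone, Copy)]
pub enum Size {
    S32,
    S64,
}

/// Little-endian bytes of the canonical NaN for the given size, in the form
/// a backend might hand to a constant pool.
pub fn canonical_nan_bytes(size: Size) -> Vec<u8> {
    match size {
        Size::S32 => CANONICAL_NAN_F32.to_le_bytes().to_vec(),
        Size::S64 => CANONICAL_NAN_F64.to_le_bytes().to_vec(),
    }
}
```

With a single definition, each ISA backend can import the value instead of repeating the immediates.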

saulecabrera · 2026-04-06T17:22:28Z

tests/misc_testsuite/canonicalize-nan-scalar.wast

Could we also implement this functionality for the non-scalar counterparts? I believe that all vector instructions are implemented in order to respect canonicalization. Once implemented we should be able to fully test simd/canonicalize-nan.wast in x86-64, by removing this line https://github.com/bytecodealliance/wasmtime/blob/main/crates/test-util/src/wast.rs#L614

Ok here we go: cdef987 (this PR).

Added maybe_canonicalize_v128_nan to the masm trait and implemented it for x64

I removed the skip and it looks like the tests pass 🎉
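The mask-and-select approach described for SIMD can be modeled lane by lane in plain Rust. This is a sketch of the intended result for four f32 lanes, assuming an all-ones mask in NaN lanes; `canonicalize_f32x4` is an illustrative name, and the real x64 code uses vector compare and select instructions rather than a per-lane loop.

```rust
const CANON_F32: u32 = 0x7FC0_0000;

/// Models the v128 sequence for four f32 lanes: a NaN test yields an all-ones
/// mask in NaN lanes, then a bitwise select picks the canonical NaN there and
/// keeps the original bits elsewhere. (Illustrative, not the Winch code.)
fn canonicalize_f32x4(lanes: [f32; 4]) -> [f32; 4] {
    lanes.map(|x| {
        // All ones if the lane is NaN, all zeros otherwise.
        let mask = if x.is_nan() { u32::MAX } else { 0 };
        let bits = x.to_bits();
        // Bitwise select: canonical NaN where mask is set, original bits otherwise.
        f32::from_bits((bits & !mask) | (CANON_F32 & mask))
    })
}
```

Because the select is purely bitwise, no branch is needed per lane, which is why the same idea was later suggested as a possible branchless variant for scalars.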

saulecabrera · 2026-04-06T17:42:39Z

winch/codegen/src/visitor.rs

 where
    M: MacroAssembler,
 {
+    fn canonicalize_nan_for_round(&mut self, size: OperandSize) -> Result<()> {


Could we replace this function with a call to self.masm.canonicalize_nan, passing in the return register of the builtin function instead? The register holding the return value should be accessible via the builtins struct. That way we can avoid a function call (to canonicalize_nan_for_round, and going through all the hoops involved in pop_to_reg) and, more importantly, we reduce the risk of this function getting used in a context other than round-result canonicalization, for which, as far as I can tell, we are not enforcing any particular invariants.

Ended up removing the function here: 1722807 (this PR).

I kept the pop/canonicalize/push inline since pop_to_reg is essentially a no-op when the value is already in a register. Wiring up the return register through float_round felt more invasive, though that might be the more purist solution.

saulecabrera

Looks good to me, thanks for iterating on this. One last request and then we can land this one.

saulecabrera · 2026-04-07T17:53:45Z

winch/codegen/src/isa/x64/masm.rs

        Ok(())
    }

+    fn maybe_canonicalize_nan(&mut self, reg: WritableReg, size: OperandSize) -> Result<()> {


I wonder if it would be worth expressing the scalar version as the vector counterpart (using a single lane), to make it branchless. We'd probably need to fallback to the branch version when avx instructions are not supported though. For now let's leave a comment here stating that in the future if a more performant version is needed, a variation of the v128 branchless could be considered for scalars.

Done! 63ad118 (this PR)

r-near · 2026-04-07T20:09:41Z

@saulecabrera could you re-add it to the merge queue? I fixed the test failure

saulecabrera · 2026-04-07T20:16:47Z

That is the right fix, yeah.

Behavior of Winch changed in bytecodealliance#12939 in such a way that it's breaking differential fuzz testing with wasmi. Now that Winch supports NaN canonicalization this commit adjusts the fuzz test configuration generation to additionally enable it for Winch in addition to Cranelift.


Copilot AI review requested due to automatic review settings April 2, 2026 04:50

r-near requested review from a team as code owners April 2, 2026 04:50

r-near requested review from alexcrichton and removed request for a team April 2, 2026 04:50

Copilot started reviewing on behalf of r-near April 2, 2026 04:51 View session

r-near marked this pull request as draft April 2, 2026 04:52

Copilot AI reviewed Apr 2, 2026


r-near force-pushed the winch-nan-canonicalization branch from fc5efb7 to 97d89bb Compare April 2, 2026 04:58

winch: respect the enable_nan_canonicalization setting

0c5e168

r-near force-pushed the winch-nan-canonicalization branch from 97d89bb to 0c5e168 Compare April 2, 2026 04:59

r-near marked this pull request as ready for review April 2, 2026 05:13

github-actions bot added the winch Winch issues or pull requests label Apr 2, 2026

add disas tests for NaN canonicalization

c476a6a

saulecabrera requested review from saulecabrera and removed request for a team and alexcrichton April 6, 2026 13:49

saulecabrera reviewed Apr 6, 2026


r-near added 5 commits April 6, 2026 11:05

rename canonicalize_nan to maybe_canonicalize_nan

7cdd58e

extract canonical NaN constants to shared masm module

75cecea

remove canonicalize_nan_for_round, inline at call sites

1722807

implement SIMD NaN canonicalization for x64

cdef987

cargo fmt

3649937

saulecabrera approved these changes Apr 7, 2026


add comment about branchless scalar canonicalization opportunity

63ad118

saulecabrera added this pull request to the merge queue Apr 7, 2026

add canonicalize-nan.wast to no-AVX skip list

9db6a79

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 7, 2026

saulecabrera added this pull request to the merge queue Apr 7, 2026

Merged via the queue into bytecodealliance:main with commit 9f93e49 Apr 7, 2026
48 checks passed

alexcrichton mentioned this pull request Apr 9, 2026

fuzz: Adjust where NaN canonicalization is configured #13030

Merged

